
use max_cores in taskQueue instead of system cores #2038

Open
kmavrommatis wants to merge 2 commits into main
Conversation

kmavrommatis


The class `TaskQueue` accepts an argument `thread_count` that is used as the maximum size of its queue.

In the `MultithreadedJobExecutor` class there is a variable (`max_cores`) that takes its value from the number of cores available on the machine. Further down in the same class, when `TaskQueue` is instantiated, `psutil.cpu_count()` is used instead of `max_cores`.

I suggest using `self.max_cores` in the call to `TaskQueue` in executor.py instead of `psutil.cpu_count()`, as sketched below.

Use case:
when a job executor is set up as `MultithreadedJobExecutor`, one can override `max_cores` after the object is initialized and limit execution to the specified number of cores.

An additional enhancement would be an argument that allows the user to provide the number of cores available for use.
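For reference, a rough sketch of the proposed change, paraphrasing the relevant part of `MultithreadedJobExecutor` in executor.py (the surrounding structure is abridged and may not match the current source exactly):

```python
import threading

import psutil

from cwltool.executors import JobExecutor
from cwltool.task_queue import TaskQueue


class MultithreadedJobExecutor(JobExecutor):
    """Paraphrased excerpt; only the TaskQueue call is the proposed change."""

    def __init__(self) -> None:
        super().__init__()
        # Defaults to the number of cores on this machine, but can be
        # overridden after the executor is created.
        self.max_cores = float(psutil.cpu_count())

    def run_jobs(self, process, job_order_object, logger, runtime_context):
        # Current behaviour: the queue's thread count is always the
        # physical core count.
        #   self.taskqueue = TaskQueue(threading.Lock(), psutil.cpu_count())
        # Proposed behaviour: honour whatever max_cores has been set to
        # (TaskQueue expects an integer thread count).
        self.taskqueue = TaskQueue(threading.Lock(), int(self.max_cores))
```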
@mr-c requested a review from tetron on September 6, 2024
@kmavrommatis
Author

Hello, is there any update on this?
Does this modification work - to your knowledge - with the overall architecture of the tool?

@tetron
Member

tetron commented Oct 23, 2024

hi @kmavrommatis, I'm taking a look at it.

It should already be the case that it will not start more than max_cores worth of work, because MultithreadedJobExecutor separately accounts for resource (cores/RAM) usage when deciding whether to put a job into the task queue. So if max_cores < cpu_count(), I don't think it would make a difference in how many parallel jobs it starts. If you are seeing something different happen that your pull request addresses, please provide more information.

On the other hand if you want to start more than cpu_count() worth of work (i.e. max_cores > cpu_count()), then the number of parallel processes would be limited by the thread count being set to cpu_count() instead of the greater value of max_cores.

That said, it would be a useful feature to add command line options like --max-cores and --max-ram that would let you set your own limits instead of the defaults that are chosen for you. Is that the feature you are looking for?
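To illustrate the accounting described above, here is a simplified, hypothetical sketch (the helper name and signature are invented; the real bookkeeping lives inside MultithreadedJobExecutor):

```python
def fits_budget(job_cores: float, job_ram: float,
                allocated_cores: float, allocated_ram: float,
                max_cores: float, max_ram: float) -> bool:
    """Admit a job only if it fits within the remaining core/RAM budget."""
    return (allocated_cores + job_cores <= max_cores
            and allocated_ram + job_ram <= max_ram)


# With max_cores=4 on a 32-core machine, at most 4 cores' worth of jobs run
# at once, even if the task queue has 32 worker threads.
print(fits_budget(job_cores=2, job_ram=1024,
                  allocated_cores=3, allocated_ram=0,
                  max_cores=4, max_ram=8192))  # False: would exceed max_cores
```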

@kmavrommatis
Author

Hi @tetron
my use case requires using more cores than are physically available on the system (because the tasks running in the pipeline use the CPU very sparingly while waiting for network signals).
I have modified cwltool to allow specifying the maximum number of cores (as you suggest) in both places, the TaskQueue and the MultithreadedJobExecutor, as a number higher than the actual CPU count (in my typical case, 256 on a 32-core instance).
Adding the arguments you suggest would work, provided that it is allowed to specify a higher number of cores than those physically available on the instance :)
Thanks
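A minimal sketch of that kind of override (it assumes the change proposed in this PR, so the task queue's thread count follows max_cores; the numbers are illustrative):

```python
from cwltool.executors import MultithreadedJobExecutor

executor = MultithreadedJobExecutor()
# Oversubscribe a 32-core instance because the jobs mostly wait on the
# network; this only increases parallelism if TaskQueue is built from
# max_cores rather than psutil.cpu_count().
executor.max_cores = 256.0
```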

@tetron
Member

tetron commented Nov 5, 2024

@kmavrommatis just to confirm, are you requesting a fractional core in your CWL file, e.g. minCores: 0.125? If so, it would just need to no longer limit the number of processes it launches to cpu_count().
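A sketch of such a fractional request in a CWL tool description (note that the ResourceRequirement field is spelled coresMin; the command and values shown are only an example):

```yaml
cwlVersion: v1.2
class: CommandLineTool
requirements:
  ResourceRequirement:
    coresMin: 0.125   # an eighth of a core, so many such steps can share one machine
baseCommand: ["curl", "https://example.org/"]
inputs: []
outputs: []
```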
